The assessment of change in affective behaviors has
become an important concern of educators, particularly in
instances where evidence is sought regarding the effects of
planned interventions on affective outcomes. In areas of
research on self-concept in particular, several studies
have attempted to construct a developmental picture of this
aspect of human behavior (e.g., Abbele, 1967; Carpenter &
Busse, 1969; Stanwyck & Felker, 1971) for the purpose of
understanding further the nature of the growth or change
process. Whether or not these studies adequately addressed
the issue of age changes in self-concept remains an open
question due to the use of cross-sectional research
methodology. However, longitudinal research also has been
conducted in this regard (Felker, 1972, 1976; O'Malley and
Bachman, 1976; Stanwyck, 1972).

Unfortunately, a review of existing research literature
on self-concept reveals inconsistencies and wide-ranging
differences in the findings reported. It may be that comparisons
among studies have suffered most because of differences in
the instruments used to measure the relevant behaviors, lack
of definitional clarity in the concepts studied, as well as
in the sampling procedures employed, and even procedural
differences in testing conditions.

From a psychometric perspective, still another problem
arises in the interpretation of research findings derived from
the use of repeated measurements from a specified population
of test-takers. Addressing this problem, Anderson (1976) posed
the question as to whether growth or change scores on a
particular affective measure could be interpreted as reflecting
behavioral changes on the variable(s) of interest among the
individuals tested, or whether the test items themselves
undergo some change over time. That is, Anderson suggested
that the use of traditional measurement models may confound
behavioral changes among persons with changes over time in
the item characteristics themselves. What would be useful,
then, in order to measure behavioral change on some
psychological construct would be a set of items whose psychometric
properties remain invariant over time.

Coupled with the measurement concerns expressed above
is the problem of generalizability of research findings
derived from an experimentally accessible population to some
theoretically larger target population. Because such
generalizations require thorough knowledge of the
characteristics of both the sampled and intended populations
(information that may be either unavailable or unknown to
a researcher), generalizability remains a major obstacle in
behavioral research. On this issue, Rasch (1960) noted the
strong interdependence between statistical tools and
characteristics of the particular sample of persons selected for
study. Rasch demonstrated that, when traditional models of
measurement are used, the psychometric properties of tests
are not specific to the instruments themselves and may vary
markedly with the sample studied. Thus, an individual's
score on a test is largely dependent for its meaning upon a
particular set of items and a particular sample of
test-takers.

During the last decade, the topic of latent trait models
has received the attention of measurement specialists as a
means of improving educational assessment practices. The
particular model advanced by Rasch (1960) has been described
as providing individual measurements of behavior that are
independent of either the sample of persons from whom the
measurements were obtained or the particular set of items
used to measure a given behavior. Moreover, it has been
claimed that, if an instrument can be demonstrated as fitting
the model, any subset of calibrated items will provide
comparable measures of the behavior in question. Thus, Rasch has
suggested that an instrument possessing the general
characteristics of his measurement model would become analogous to
a yardstick used to measure the length of physical objects.
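
The yardstick analogy can be illustrated with a short sketch.
Under the Rasch model, the probability of a keyed response
depends only on the difference between a person parameter and
an item parameter (both in logits), so the log-odds difference
between two persons is the same whichever item is used. The
function names and all numeric values below are illustrative
assumptions and are not taken from the study.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a keyed ("yes") response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Hypothetical person abilities and item difficulties, in logits.
person_a, person_b = 1.2, -0.3
item_difficulties = [-1.0, 0.0, 0.8]

# Compare the two persons on each item separately.
differences = [
    log_odds(rasch_probability(person_a, d))
    - log_odds(rasch_probability(person_b, d))
    for d in item_difficulties
]

# Each entry equals person_a - person_b (1.5 logits), whatever the item.
print(differences)
```

In this sketch the comparison between persons does not depend
on which item is chosen, which is the sense in which a
calibrated item set is said to behave like a yardstick.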

The purpose of this study was to examine the usefulness
of the Rasch logistic measurement model for longitudinal
research on change in affective behaviors of children.
Specifically, evidence was sought regarding the degree to which
the Rasch model claims were substantiated in the measurement
of affective behavioral outcomes. In testing the claims of
the model, special attention was given to the issue of item
subset equivalence.

Examinee Population

The primary sample for this study consisted of 1,927
elementary-school children for whom measures of self-concept
had been obtained during September, 1972, May, 1973, and
September, 1973, as part of a longitudinal study of children's
self-concept development (Felker, 1972). Testing was conducted
under classroom conditions using four schools in northwestern
Indiana.

Instrumentation

The Piers-Harris Self-Concept Scale (Piers & Harris, 1964)
was used in the present study as the measure of self-concept.
In its original form, the scale includes 80 declarative
statements, originally developed from Jersild's (1952) categories,
and requires the examinee to respond either yes or no on the
basis of whether or not each statement is congruent with
the examinee's perception of him or herself. The authors
reported that the instrument was developed as a measure of
general self-concept. However, a recent review of research
on the instrument (Shavelson, Hubner, & Stanton, 1976)
suggests the presence of several different dimensions, with
general self-concept (total scores on the scale) possibly
reflecting a relatively enduring characteristic of the
individual test-taker.

Design 

The test responses obtained during September, 1972, from
a random sample (without replacement) of 1,000 subjects drawn
from the total examinee population were used to calibrate the
Piers-Harris Scale. Because calibration of an instrument
rarely is accomplished in a single computer run, a series of
analyses was required in order to produce a final set of
items whose properties satisfied the assumptions underlying
the application of the Rasch measurement model. The CALFIT
computer algorithm (Wright & Mead, 1975) was used to calibrate
the scale and, hence, to estimate the Rasch person and item
parameters explicit in the measurement model.

Following the sequence of test calibrations noted above,
a final set of 25 items (from an original pool of 80) that
fit the Rasch formulation was obtained. To test the model's
claim that any subset of items from a calibrated pool of items
may be used to provide comparable measures of the construct
in question, subsets of 16 items each were drawn randomly
(without replacement) from the 25 calibrated items. The
determination of the number of items to be included in these
subtests (k = 16) was based upon the results of a study
conducted by Garrison (1976), indicating that this number of items
was necessary to establish an average stability coefficient of
.65 between testing times.
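
The subset-drawing step can be sketched as follows. This is a
minimal illustration, assuming that each 16-item subtest is
drawn independently and without replacement from the 25
calibrated items. The item labels (1 through 25), the function
name, and the seed are placeholders, since the identities of
the calibrated items are not listed here.

```python
import random


def draw_item_subsets(calibrated_items, subset_size, n_subsets, seed=None):
    """Draw n_subsets random subtests, each sampled without replacement."""
    rng = random.Random(seed)
    return [
        sorted(rng.sample(calibrated_items, subset_size))
        for _ in range(n_subsets)
    ]


# Placeholder labels standing in for the 25 calibrated items.
calibrated_items = list(range(1, 26))

# Fifteen 16-item subtests; the seed value is an arbitrary assumption.
subsets = draw_item_subsets(
    calibrated_items, subset_size=16, n_subsets=15, seed=1976
)

for index, subset in enumerate(subsets, start=1):
    print(index, subset)
```

Because each subtest contains 16 of only 25 items, any two
subtests necessarily share at least seven items, so the
subtests overlap rather than forming disjoint forms.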

In order to examine the effectiveness of an intervention
program (Felker, 1972) on the development of individuals'
self-concept, the test data were analyzed using a repeated
measures analysis of variance design (Winer, 1971). Analyses
were performed separately for each sex and consisted of two
between (or crossed) factors (experimental vs. control
treatment groups and grade level of respondents) and one within
(or nested) factor (time of testing). Thus, the dependent
measures for the analyses consisted of total scores obtained
by examinees on each of the following instrument forms: (1)
the 80-item original Piers-Harris Scale; (2) a 25-item
instrument composed of items fitting the measurement model; and (3)
fifteen (15) 16-item tests composed of items drawn randomly
from the set of 25 calibrated items. The results of each of
the repeated measures analyses of variance were compared to
determine whether self-concept changes over time were
manifested consistently for each of the differing test formats
studied.



RESULTS 

Using the response data obtained from 924 male and female
pupils in grades 2-5 over a one-year period, repeated measures
analyses of variance were used to examine the consistency of
self-concept changes manifested over time when the test stimuli
were manipulated deliberately. In order to avoid confounding
of sex differences in self-concept development with the more
meaningful differences under investigation in this study, the
data obtained from males and females were analyzed separately.
Table 1 summarizes the results of analyses using scales composed
of 80, 25, and 16 items for the male examinees. Except for the
mean square error terms, the cell entries represent F-ratios.

Insert Table 1 about here 



The analyses performed on both the 80-item and 25-item
sets utilized identical instruments, respectively, over three
different times of measurement. However, it must be noted
that each of the 16-item sets differed from one time of
measurement to another. The variation among the 16-item scales
was introduced in order to test the Rasch claim that "any"
set of calibrated items may be used to measure a person's
position along some latent continuum.

An examination of the data provided in Table 1 indicates
that the significant effects observed for the 80-item
self-concept instrument generally continued to be manifested with
the use of 25 and 16 items. In particular, the main effect
due to the experimental treatment was consistent. Upon
further examination, it was found also that, for each
comparison of the treatment means, the experimental group scored
significantly higher than did the control group of respondents.
For all analyses, total test scores were expressed in the
Rasch log metric.
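
The conversion of a raw score to the log (logit) metric can be
sketched as follows. The estimation routine used in the study
is not described here, so the sketch below assumes a standard
maximum likelihood approach: with item difficulties treated as
known, the ability estimate is the value at which the expected
score equals the observed raw score, found by Newton-Raphson
iteration. The item difficulties and function names are
hypothetical.

```python
import math


def expected_score(ability, difficulties):
    """Expected raw score for a given ability under the Rasch model."""
    return sum(1.0 / (1.0 + math.exp(-(ability - d))) for d in difficulties)


def ability_from_raw_score(raw_score, difficulties, tol=1e-8, max_iter=100):
    """Maximum likelihood ability (in logits) for a raw score."""
    k = len(difficulties)
    if not 0 < raw_score < k:
        raise ValueError("zero and perfect scores have no finite estimate")
    # Starting value: log-odds of the observed proportion of keyed responses.
    ability = math.log(raw_score / (k - raw_score))
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(ability - d))) for d in difficulties]
        residual = raw_score - sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        step = residual / information
        ability += step
        if abs(step) < tol:
            break
    return ability


# Hypothetical difficulties for a 16-item subtest, evenly spaced in logits.
hypothetical_difficulties = [-1.5 + 0.2 * i for i in range(16)]

# Logit score for every raw score that has a finite estimate.
logit_scores = {
    raw: ability_from_raw_score(raw, hypothetical_difficulties)
    for raw in range(1, 16)
}
print(logit_scores)
```

Under these assumptions every examinee with the same raw score
on the same item set receives the same logit score, and scores
from different calibrated item sets are expressed on a common
scale because each set's item difficulties enter the estimate.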

The T (time of measurement) main effect also showed
consistency among the analyses presented in Table 1 and,
generally, reflected an upward movement from Time 1 to Time 3
in terms of mean self-concept. In the case of the treatment
by time of measurement interaction (A x T), the findings were
not as clearly interpretable. That is, the significant
interaction observed from the 80-item data was manifested in only
50% of the analyses performed using fewer items.

The results of analyses of variance performed on the
self-concept data for the female examinees are presented in
Table 2.



Insert Table 2 about here



As was found for the results of analyses performed using
the male respondents, both the A and T main effects tended to
show consistency over the differing test formats. However,
the A x B and A x B x T interactions observed for the 80-item
data set consistently failed to appear in subsequent analyses
using fewer items. While the reason for this finding was not
clear, it was speculated that the reduction in total test
score variance for the 16-item subtests (attributable to
test length and similar item discrimination indices) may
have accounted for the "loss of power" in uncovering these
subtle interactions.

A somewhat different pattern of results than those
described above was obtained from an examination of the data
contained in Table 3.



Insert Table 3 about here



Table 3 details the proportion of total variance attributable
to each of the experimental effects reaching statistical
significance for the test formats studied. While the differences
in the proportions reported are probably too small to be
theoretically important, it is interesting, nonetheless, to
note the effects of reducing the total number of test stimuli
on the resulting proportions.
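
A proportion of this kind can be sketched as the ratio of an
effect's sum of squares to the total sum of squares. The
formula actually used to produce Table 3 is not stated in the
text, so this definition is an assumption, and the sums of
squares below are made-up values chosen only to show the
arithmetic.

```python
def variance_proportions(sums_of_squares):
    """Return each source's share of the total sum of squares."""
    total = sum(sums_of_squares.values())
    return {source: ss / total for source, ss in sums_of_squares.items()}


# Made-up sums of squares for illustration; not data from the study.
example_ss = {"A": 12.0, "T": 8.0, "A x T": 2.0, "Error": 978.0}

proportions = variance_proportions(example_ss)
for source, share in proportions.items():
    print(f"{source:6s} {share:.4f}")
```

With the illustrative values, the treatment effect (A) accounts
for 12 of 1,000 units of variation, a proportion of .0120, and
the shares sum to one because every source is included.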

For the male respondents, reducing the number of test
items to 25 and 16, respectively, resulted in some gain in the
proportion of variance attributable to the experimental
intervention. However, whatever gain resulted for the male
respondents was offset by a decrease in the same proportion of variance
attributable to the experimental treatment among the female
respondents. Thus, the effects of fitting a set of items to
the Rasch logistic model resulted in some efficiency and
consistency of the changes manifested in males over time,
while the same conclusion could be drawn only tenuously for
the female population of respondents.

DISCUSSION AND CONCLUSIONS

The present study was conducted to examine the usefulness
of the Rasch logistic measurement model in developmental studies
of children's self-concept. A calibration analysis of the
Piers-Harris Self-Concept Scale (Piers & Harris, 1964) resulted
in the identification of a set of 25 test items that fit the
assumptions of the model (Garrison, 1976). In fitting the model
to the data, it was observed that a large proportion of the
total number of items contained in the intact instrument was
discarded. However, the reductions in number of test stimuli
used to assess growth or change along an affective dimension
did not appear to alter markedly the conclusions drawn from
analyses of data collected using the larger item pool. Thus,
if one is willing to accept slightly less control in an
experimental context, it appears reasonable that Rasch calibration
procedures may be useful in constructing tests which are
unidimensional and considerably shorter in overall length than
those typically utilized in psychological research. More
importantly, the unidimensional nature of Rasch calibrated
tests (Rasch, 1960; Wright & Mead, 1975) may well serve to
clarify the nature of the construct being measured.

Within the limits of statistical probability, then, the
Rasch model was found to be useful in reducing the number of
items required to measure growth or change along selected
psychological characteristics. Furthermore, as Anderson (1976)
noted, if one is to accurately estimate change among
individuals along an underlying construct, then it is necessary to
develop instruments with meaningful units. However, the
assertion that "any" set of calibrated items may be used to
measure a person's position along some latent continuum
represents a significant departure from the traditional approach to
equating instruments. Yet, if it can be determined that item
sets that have not been matched for difficulty provide the
same information about persons as do instruments that have been
equated using traditional procedures, then the classical test
theory requirements for equating instruments (i.e., equal item
difficulties, discriminations, means and variances) may be
unnecessary.

In conclusion, research on the application of the Rasch
measurement model has been limited primarily to the measurement
of intelligence or achievement-related outcomes. Yet, an
examination of the conditions specified by Wright and Mead
(1975) for the use of the model suggests that it may have
utility also in the measurement of affective behaviors. Future
research on the validity of the model may seek to pursue evidence
for the model's claims along longitudinal avenues. Whereas much
attention has now been devoted to the robustness of the model's
underlying assumptions, much more effort must be expended in
establishing the model's utility within an experimental context.

Table 1

Summary of Analysis of Variance Results for Male
Self-Concept Data (n = 494)

                                        Full         Calib.       Random       Random       Random       Random       Random
Source of                               Scale        Item Scale   I            II           III          IV           V
Variation                  df           (k=80)       (k=25)       (k=16)       (k=16)       (k=16)       (k=16)       (k=16)

Between Subjects
  Treatment Groups (A)     1            2.928*       11.??3**     7.163**      10.???**     10.282**     7.???        6.5?9**
  Grade (B)                3            .209         .359         .252         [illegible]  .232         1.?58        [illegible]
  A x B                    3            1.36?        .939         .556         [illegible]  1.359        .?58         .75?
  Error Mean Square        [illegible]  [illegible]  [illegible]  2.603        [illegible]  2.589        2.503        2.1?3

Within Subjects
  Time of Measurement (T)  2            [illegible]  [illegible]  3.171**      10.?0?**     13.???       12.85?**     13.805**
  A x T                    2            3.105**      2.397*       [illegible]  3.175**      2.0?2        [illegible]  1.85?
  B x T                    6            .?27         .552         .985         .?00         .653         .507         1.93?*
  A x B x T                6            1.0?1        .910         1.311        .568         .839         1.505        1.962*
  Error Mean Square        972          [illegible]  .5?0         .512         .5?6         .55?         .561         .550

Note: Except for error mean square terms, the cell entries represent F-ratios.
A question mark stands for a digit that is illegible in the source.



Table 2

Summary of Analysis of Variance Results for Female
Self-Concept Data (n = 430)

[Table body not recoverable from the source; column and row
assignments of the surviving entries are lost.]

Note: Except for error mean square terms, the cell entries represent F-ratios.

 *p < .10
**p < .05

Table 3

Proportion of Total Variance
Attributable to Significant Effects for Differing Test Formats

Sex of         Significant Effects         Number of Items
Respondents    Observed                 80        25        16*

MALES          A                      .0046     .0165     .0115
               T                      .0065     .0092     .00??
               AT                     .0014     .0013     .0012
               Error                  .9776     .9641     .9704

FEMALES        A                      .0110     .0079     .0047
               AB                     .0126     .0067     .0061
               T                      .0070     .0096     .0095
               ABT                    .0029     .0018     .0016
               Error                  .9622     .9643     .9694

* Proportions appearing in this column represent averages based
upon the analyses of variance of the 15 sets of 16 items
randomly drawn from the calibrated set of 25 items.
A question mark stands for a digit that is illegible in the source.



